Proceedings of the 9th Workshop on Asian Language Resources, Collocated with IJCNLP 2011

Authors

  • Rachel Edita O. Roxas
  • Sarmad Hussain
  • Rowena Cristina Guevara
Abstract

Current approaches to understanding and modeling language depend heavily on language resources. A language with rich resources benefits greatly and can advance quickly in language processing. Less-resourced languages, by contrast, struggle to prepare sufficiently large resources such as raw text or annotated corpora, which is a labor-intensive and time-consuming task. Computerizing the text is a further non-trivial effort: it requires a supportive computing environment for input, encoding, retrieval, analysis, and so on. Learning from the resource-rich languages, we are gradually collecting resources and preparing the necessary tools.

Through many efforts in recent years, we can see significant outcomes from the PAN Localization project (2004-2007, 2007-2010, http://www.panl10n.net/), ADD (2006-2010, http://www.tcllab.org/), Asian WordNet (http://asianwordnet.org/), Hindi WordNet (http://www.cfilt.iitb.ac.in/wordnet/webhwn/), BEST (Thai Word Segmentation Software Contest, since 2009, http://thailang.nectec.or.th/best/), and many NLP summer schools. These activities have great potential to advance both NLP tool development and the development of research personnel, and they have led to substantial growth in Asian language resource development and research.

In the spirit of sharing on social networks, resources can be developed efficiently to a satisfactory size within a reasonable time. Asian WordNet is an example: it develops WordNets for a set of 13 languages, connected via Princeton WordNet. Thai WordNet is open for online collaborative development, and about 70K synsets and 80K words of it are available online. Thai-Lao conversion demonstrates how language similarity can be used to build up the resources of another language: Lao WordNet is created by converting Thai WordNet using a phoneme transfer approach. By taking advantage of language similarity, a corpus can be obtained quickly through conversion rules, and in this case direct transfer is much more efficient than creating the resource from scratch.

Currently, most of the results mentioned above are open to the public, at least for research purposes. However, more language resources are still needed to improve language processing. The possibility of online collaborative development and sharing is a key factor in language resource development.

Similar resources

Proposal for the International Standard Language Resource Number

In this paper, we propose a new identifier scheme for Language Resources that gives each resource a unique name using a standardised nomenclature. This will also ensure that Language Resources can be identified, and consequently recognised as proper references, in activities within Human Language Technologies as well as in documents and scientific papers.

A Method Towards the Fully Automatic Merging of Lexical Resources

Lexical Resources are a critical component of Natural Language Processing applications. However, the high cost of comparing and merging different resources has been a bottleneck to obtaining richer resources and a broader range of potential uses for a significant number of languages. With the objective of reducing cost by eliminating human intervention, we present a new method towards the automat...

The Language Library: Many Layers, More Knowledge

In this paper we outline the general concept of the Language Library, a new initiative that aims to build a huge archive of structured collections of linguistic information. The Language Library is conceived as a community-built repository and as an environment that allows language specialists to share multidimensional and multi-level annotated/processed resources. The first steps t...

Author Response to Comment on "Aluminum Phosphide Poisoning: A Case Series in North Iran"

Dear Editor, I wish to thank the authors for their careful comment (1) on our short report entitled “Aluminum Phosphide Poisoning: A Case Series in North Iran” (2). Unfortunately, as we were excited to display the findings of our series, some facts were missed from the methods of the article. The results presented in our series were collected from eight patients with definitive aluminum phosp...

Prospects for an Ontology-Grounded Language Service Infrastructure

Servicization of language resources (LR) and technologies (LT) on an appropriately designed and adequately operated infrastructure is a promising solution for sharing them effectively and efficiently. Given this rationale, this position paper reviews relevant attempts around the Language Grid, and presents prospects for an ontology-grounded language service infrastructure. As the associated issu...

Publication date: 2011